Distinct activation mechanisms regulate subtype selectivity of Cannabinoid receptors

Design of cannabinergic subtype selective ligands is challenging because of high sequence and structural similarities of cannabinoid receptors (CB1 and CB2). We hypothesize that the subtype selectivity of designed selective ligands can be explained by the ligand binding to the conformationally distinct states between cannabinoid receptors. Analysis of ~ 700 μs of unbiased simulations using Markov state models and VAMPnets identifies the similarities and distinctions between the activation mechanism of both receptors. Structural and dynamic comparisons of metastable intermediate states allow us to observe the distinction in the binding pocket volume change during CB1 and CB2 activation. Docking analysis reveals that only a few of the intermediate metastable states of CB1 show high affinity towards CB2 selective agonists. In contrast, all the CB2 metastable states show a similar affinity for these agonists. These results mechanistically explain the subtype selectivity of these agonists by deciphering the activation mechanism of cannabinoid receptors.

C annabinoid receptors (CB 1 and CB 2 ) were first discovered as a target of phytocannabinoids in the last decades of the twentieth-century 1,2 . Important physiological and psychological roles of these receptors were soon perceived by the elucidation of the endocannabinoid signaling system [3][4][5] . CB 1 is majorly expressed in the central nervous system and has a role in appetite control, neuroprotection, and neurotransmission 6,7 . CB 2 is majorly expressed in the immune system; targeted for immunomodulation and inflammation 6,8 . These receptors belong to the lipid subfamily of class A GPCRs [9][10][11][12][13][14][15][16][17] .
Class A GPCRs are recognized by conserved motifs which undergo structural changes during activation of the protein and help to transduce intracellular signaling by β-arrestins, and G-proteins [18][19][20] . GPCR are targets of more than 30% of FDAapproved drugs [21][22][23] . Similarly, academic labs and pharmaceutical companies have led drug discovery efforts to develop molecules targeting cannabinoid receptors [24][25][26][27] . Earlier discovery campaigns resulted in molecules structurally similar to the known phytocannabinoids and endocannabinoids, which was followed by generation of chemically diverse molecules 28 . Most of these ligands bind to both the receptors and lead to non-specific side effects [29][30][31] ; only a few designed ligands are selective to CB 1 or CB 2 32 . Specifically, CB 2 selective ligands can be used to explore the biology of CB 2 , which has been described as "a cannabinoid receptor with an identity crisis" 33,34 . Furthermore, CB 2 selective agonists hold a promise to be an important drug target for pain and inflammation without the psychoactive side effects caused by the binding to CB 1 . However, an explanation of mechanistic underpinning for these ligands' selectivity is still missing. Understanding subtype selectivity between CB 1 and CB 2 would be important for selective drug design [35][36][37] .
Lack of subtype selectivity is also a major issue in drug design for other subfamilies of class A GPCRs. For example, the main targets of ergotamine are 5-HT 1B and 5-HT 1D receptors; however, its usage is very limited due to off-target actions on the other 5-HT subtype receptors, particularly on the 5-HT 2B receptor that causes life-threatening cardiovascular disorders 38 . Structurebased approaches like docking studies or data-driven methods like machine learning have been employed to tackle the subtype selective drug design for class A GPCRs [39][40][41][42][43] . Large-scale docking studies have been performed where ligands were docked into the static crystal structure or homology models to explain specific interactions crucial for subtype selectivity. For instance, important protein residue difference in TM5 of 5-HT 1B and 5-HT 2B was found to be an essential factor in drug design 44 . However, docking does not take into account the large structural changes in the receptor binding pocket. Therefore, docking might not reveal the correct binding poses for ligands, especially when the binding pocket undergoes a significant volume change. Previous crystal structure studies show that selectivity between A 1 -AR and A 2A -AR appears due to structural rather than sequence changes 45 . On the contrary, machine learning models (e.g., LiCABEDS) have provided interpretability about the ligand scaffold that might lead to subtype selectivity. However, these ML models fail to consider receptor-ligand interactions, which leads to high false positive rate in their predictions 46,47 . In the last few years, crystal structures of inactive and active states of CB 1 and CB 2 were determined, which revealed ligand binding poses, and important structural changes during the activation of CB 1 and CB 2 (Supplementary Table 1). Comparing the antagonistbound pose of inactive CB 1 and CB 2 shows the distinction between binding poses of antagonist of CB 1 and CB 2 ( Supplementary  Fig. 1a). However, structural comparison of agonist bound pose of unselective ligands of CB 1 and CB 2 shows that binding pocket residues have high structural and sequence similarity (Supplementary Fig. 1b, e). Docking performed using the experimentally determined structures did not reveal differences in the binding poses nor could determine subtype selectivity for CB 1 and CB 2 9,11 . Furthermore, during the activation process, the CB 1 binding pocket undergoes large conformational changes due to the upward movement of the N-terminus, whereas CB 2 pocket retains its overall shape ( Supplementary Fig. 1c, d). Considering these two factors, we hypothesized that CB 2 selective agonists conformationally select for the CB 2 pocket shape. As the pocket shape remains similar through out the activation process, these ligands have the higher possibility of binding to the CB 2 rather than CB 1 , where the pocket shape and volume changes significantly during the activation thereby decreasing the overall binding affinity (Fig. 1a, b).
Here, we compared the activation mechanism for CB 1 and CB 2 using molecular dynamics simulations to obtain their entire conformational ensemble. Using 700 μs of aggregate unbiased simulation data, we deciphered the similarity and distinction of activation mechanism between CB 1 and CB 2 . Analyzing the data using markov state model, we found important allosteric communications between different structural motifs, which facilitates the receptor activation. By implementing deep learning based VAMPnets architecture, we discretize the conformational space into different metastable states 48 . Docking of the CB 2 selective ligands shows the docking affinities of these ligands are similar for all the metastable states in CB 2 whereas these ligands bind to only specific metastable states in CB 1 . Binding of CB 2 selective ligands favorably to only specific metastable states might explain the subtype selectivity of the ligands. This mechanistic understanding of ligand selectivity between CB 1 and CB 2 will aid in the design of new subtype-selective drugs for cannabinoid receptors.

Results and discussions
Structural comparison of CB 1 and CB 2 crystal structures. In the last few years, a plethora of CB 1 and CB 2 structures are reported in different conditions as shown in Supplementary Table 1. A comparison of these crystal structures shows major structural changes in the receptors' extracellular, transmembrane, and intracellular regions during the activation. As both these receptors belong to the lipid subfamily of class A GPCRs, some of these changes are conserved between CB 1 , and CB 2 9,11,12,16 . For example, both the receptors undergo large conformational changes in intracellular TM6 towards the outward direction, as shown in Fig. 2. Projection of intracellular TM6 movement (R 3.50 -K 6.35 distance) feature on one dimensional space shows that this feature can distinguish active and inactive structures for both CB 1 and CB 2 ( Fig. 3 and Supplementary Fig. 2 and Supplementary Note 1). Similarly, the sidechain of conserved Y 7.53 in the NPxxY region of intracellular TM7 moves towards TM5 during the full activation of the receptors (Fig. 2). Projection of intracellular TM7 distance (I 5.54 -Y 7.53 distance) also shows that this feature changes during activation of CB 1 and CB 2 (Fig. 3). Therefore, these two conserved intracellular features are used as metrices to judge whether a structure is active or inactive. However, other conformational changes are not conserved between the CB 1 and CB 2 . Structural overlap of a representative active and inactive CB 1 and CB 2 structures show the following non-conserved changes in CB 1 and CB 2 .
(1) For CB 1 , in the extracellular region, N-terminus shows large conformational differences in agonist bound structure as compared to the antagonist bound structures (Fig. 2a). Agonist and antagonist bound poses are also different for CB 1 11,13 . Agonist binding lifts the N-terminus above the orthosteric pocket. The N-terminus distance (M N-term -D 2.50 distance) from the ligand binding pocket clearly shows the difference in the agonist bound structures as compared to the antagonist bound structures (Fig. 3a). As agonist binding leads to the activation of CB 1 , active and inactive structures have distinct N-terminus feature values. The exception to this rule is the agonist and Negative Allosteric Modulator (NAM) bound structure (PDB ID: 6KQI 14 ), where the structure remains inactive because of NAM binding. On the other hand, in the active and inactive structure, N-terminus always remain above the orthosteric pocket; therefore, N-terminus feature values do not distinguish between active and inactive crystal structures for CB 2 (Figs. 2b, 3b).
(2) In CB 1 , due to the extended structure of the antagonist, TM1 moves outwards towards the membrane in the inactive structures (Fig. 2a), whereas extracellular TM1 conformation of CB 2 stays relatively similar in the active and inactive state in the protein (Fig. 2b). The higher variation in extracellular TM1 movement between active and inactive structures in CB 1 as compared to CB 2 is shown in (Fig. 3a, b).
(3) Another major difference in activation of CB 1 and CB 2 is observed in toggle switch residue movement. For CB 1 , twin toggle switch residues (W 6.48 and F 3.36 ) undergo translational changes (Fig. 2a). During the activation, W 6.48 moves towards the TM5, which leads to the displacement of F 3.36 towards TM2. The displacement is captured by the difference in the center of mass of the sidechain of these two residues in the z-direction (Supplementary Fig. 2). In the case of CB 2 , F 3.36 does not undergo conformational change (Fig. 2b). Instead, agonist binding to CB 2 has shown to rotate the toggle switch residue W 6.4816 . This rotational motion is captured by calculating the χ 2 angle of W 6.48 . Projection of translational and rotational movement features of toggle switch show that former feature could capture the distinction between active and inactive structures for CB 1 and latter for CB 2 (Fig. 3a, b).
(4) For CB 2 , inactive structures in intracellular TM5, the sidechain of Y 5.58 faces towards the membrane and creates more space for TM6 to move inside in the inactive structure. In the active structure, Y 5. 58  Thermodynamic relevance of structural features in cannabinoid receptors' activation. All extracellular (N-terminus and TM1 movements), transmembrane (Toggle Switch translation and rotation), and intracellular features (TM5, TM6 and TM7 movements) that distinguish active and inactive states of the receptors identified from different crystal structures may not be thermodynamically and kinetically relevant (Fig. 3). Experimentally determined side chain conformations occasionally do not have good resolutions. Hence, it may over or under emphasize some of the conformational changes. Furthermore, features that do not distinguish between active and inactive state for respective proteins (e.g., Toggle switch rotation for CB 1 ; TM1 movement for CB 2 ) may obtain distinct stabilized values in intermediate states during activation. Therefore, to check the thermodynamic relevance of each feature in CB 1 and CB 2 conformational ensemble, we compared the apo (without ligand bound) and holo (with ligand bound) simulation of both proteins.
Compairing~20 μs of unbiased MD simulation for agonist bound active and antagonist bound inactive states of both CB 1 and CB 2 (Supplementary Table 1), it is observed that the conformational ensemble of every feature remains stable   this rule is the extracellular TM1 movement for CB 2 , where ligandbound active and inactive conformational ensembles show overlapping distribution, whereas apo activation conformational changes much wider distribution of feature values ( Supplementary  Fig. 3), which may reveal the newer conformations. Therefore, using our simulation, we observe changes in the conformational features of the proteins that are not deemed important from static crystal structures. Overall, for CB 1 , features that undergoes conformational changes during activations are N-terminus distance, extracellular TM1 movement, toggle switch translational movement, intracellular TM6 and TM7 movements. Contrastingly for CB 2 , conformational changes are observed in extracellular TM1 movement, toggle switch rotational movement, intracellular TM5, TM6 and TM7 movements.
Allosteric communications between CB 1 and CB 2 structural features. In the active state, conformational changes in the intracellular part of GPCR proteins help to transduce signals by secondary messenger (G protein and β-arrestin) 20,50 . These intracellular changes are allosterically linked with extracellular and transmembrane regions of protein 19 . To observe how the conserved and non-conserved changes in CB 1 and CB 2 are allosterically connected, we predicted whether the change in one feature leads to the change in the other features. To estimate this, the absolute difference of P(Feature active |Condition active ) and P(Feature active |Condition inactive ) was determined for every combination of features. These conditional probabilities were measured by discretizing the features based on a threshold that divides the features into active and inactive. The threshold values  Table 1 Comparisons of conformational feature values between apo (without ligand) and holo (with agonist and antagonist) simulations of CB 1 using mean difference and K-L divergence. are taken as an average of active and inactive crystal structures. For the conditional probability calculations, weight of each frame was updated using equilibrium probability distribution estimated using MSM with lag time 25 ns (More details in Method section). Based on the calculations for CB 1 , it shows that N-terminus movement is highly correlated with all the extracellular (TM1 movement), transmembrane (Toggle switch translational movement), and intracellular features (TM6 and TM7 movement), showing the importance of the N-terminus movement in activation ( Fig. 4a and Supplementary Fig. 4a). Projection of MSM weighted free energy landscapes show that with the N-terminus inside the pocket, other features remain mostly in the stabilized inactive state (Supplementary Fig. 5a-d). On the other hand, extracellular TM1 movement is not strongly coupled with transmembrane and intracellular features ( Fig. 4a and Supplementary Fig. 4b). Free energy projection of TM1 movement with respect to intracellular and transmembrane features shows those features are able to obtain active and inactive forms without depending on the TM1 value (Supplementary Fig. 5e-g). Similarly, toggle switch translational movement also shows relatively smaller values of coupling for intracellular TM6 movements and TM7 movements ( Fig. 4a and Supplementary  Fig. 4c). However, the MSM-weighted free energy landscape can explain this relatively smaller value. The free energy plots show an L-shaped landscape between the toggle switch and intracellular features depicting the change to be sequential (Supplementary Fig. 5h-i). Thus, changes in the toggle switch can lead to changes in the intracellular features. Lastly, both conditional probability difference and free energy landscape show the intracellular features are highly coupled with each other, which depicts that the TM6 movement will lead to the TM7 movement (Fig. 4a, Supplementary Figs. 4d, e and 5j). Based on the calculations for CB 1 , it shows that N-terminus movement is highly correlated with all the extracellular (TM1 movement), transmembrane (Toggle switch translational movement), and intracellular features (TM6 and TM7 movement), showing the importance of the N-terminus movement in activation ( Fig. 4a and Supplementary  Fig. 4a). Projection of MSM weighted free energy landscapes show that with the N-terminus inside the pocket, other features remain mostly in the stabilized inactive state (Supplementary Fig. 5a-d). On the other hand, extracellular TM1 movement is not strongly coupled with transmembrane and intracellular features ( Fig. 4a and Supplementary Fig. 4b). Free energy projection of TM1 movement with respect to intracellular and transmembrane features shows those features are able to obtain active and inactive forms without depending on the TM1 value (Supplementary Fig. 5e-g). Similarly, toggle switch translational movement also shows relatively smaller values of coupling for intracellular TM6 movements and TM7 movements ( Fig. 4a and Supplementary Fig. 4c). However, the MSM-weighted free energy landscape can explain this relatively smaller value. The free energy plots show an L-shaped landscape between the toggle switch and intracellular features depicting the change to be sequential ( Supplementary Fig. 5h, i). Thus, changes in the toggle switch can lead to changes in the intracellular features. Lastly, both conditional probability difference and free energy landscape show the intracellular features are highly coupled with each other, which depicts that the TM6 movement will lead to the TM7 movement ( Fig. 4a, Supplementary Figs. 4d, e, and 5j).
For the CB 2 , all the intracellular features are highly coupled with the extracellular TM1 movement ( Fig. 4b and Supplementary Fig. 6a). MSM weighted free energy landscape shows that when intracellular features remains in the inactive states, extracellular TM1 can have a wider range of movement. Whereas, in active state, extracellular TM1 movement is restricted ( Supplementary Fig. 7a-d). From the experimentally obtained structures, it has been shown that toggle switch residue rotates when the agonist molecule binds to the receptor. Therefore, it has been hypothesized that toggle switch rotation caused by agonists leads to intracellular changes in the CB 2 receptor 15,16 . However, our calculations show that toggle switch rotation has minimal effect on the intracellular movement ( Fig. 4b and Supplementary  Fig. 6b). From the free energy projection, it is also clear that intracellular features can adopt both active and inactive forms irrespective of toggle switch rotation feature value (Supplementary Fig. 7e-g). Agonist-bound CB 2 structure (PDB ID: 6KPC) remains in an inactive state as there is no intracellular movement observed, which supports our finding that there is a lack of correlation between toggle switch rotation and intracellular movement 16 . For CB 2 , intracellular TM5 movement is shown to be important for canonical GPCR activation. The free energy landscape shows that L shaped landscape between intracellular TM5 movement with respect to the intracellular TM6 and TM7 movement, which depicts that TM5 movements may be the cause of the activation of the GPCR, not the toggle switch rotation ( Supplementary Fig. 7h, i). Lastly, similar to CB 1 , intracellular TM6 and TM7 movement are highly correlated with each other (Fig. 4b and Supplementary Fig. 7j). Therefore, these analyses show that most of these features (Except for TM1 movement for CB 1 ; toggle switch rotation in CB 2 ), which are physically far away in protein topology, are allosterically influence conformational changes of other features resulting in protein activation.
Activation mechanism for CB 1 and CB 2 . Previous sections show the important conformational changes for CB 1 and CB 2 during the activation. However, individual feature movements or projections of a few features do not reveal the activation mechanism of the receptors and intermediate states involved in the process. To better understand the activation mechanism, we clustered the Table 2 Comparisons of conformational feature values between apo (without ligand) and holo (with agonist and antagonist) simulations of CB 2 using mean difference and K-L divergence.  Table 4 and Supplementary Fig. 12). Whereas, I3 and I4 are structurally similar to active state (Supplementary Table 4 and Supplementary Fig. 12). The structural differences between different metastable states are calculated using K-L divergence based on the closest heavy atom distances between all residue pairs with respect to the inactive state. In Figs. 5 and 6, the calculated K-L divergence is shown as a color bar on the representative protein structure, where the red represents structural divergence. Furthermore, normalized values of the important features (that have been discussed in the earlier section) are calculated for each metastable state as shown in Figs. 5 and 6.
Transition kinetics between the metastable states are shown as mean free passage time (MFPT) if there is a direct transition. Using the above analyses, it is observed that from the CB 1 inactive state, both I1 and I2 transitions are possible. I1 transitions are kinetically more favorable compared to I2. Inactive to I1 transition involves the movement of intracellular TM1 region as shown in K-L divergence analysis and bar plot (Fig. 5). In the case of inactive to I2 and I1 to I2, the transition involves the movement of the N-terminus in the upward direction. Movement of the N-terminus is a slow timescale process as these movements are related to the first eigenvectors of MSM ( Supplementary Fig. 13). Intracellular and transmembrane regions of I1 and I2 metastable states remain similar to inactive states. Conformations of the I2 metastable state are similar to the agonist, and NAM-bound crystal structure (PDB ID: 6KQI 14 ), where the structure remains in the inactive state in spite of the binding of the agonist in the pocket (Supplementary Fig. 14). I1 to I3 transition involves rearrangement of the intracellular and transmembrane features as N-terminus starts to move out of the pocket. K-L divergence analysis and bar plot show that the major difference occurs in Nterminus, toggle switch region, intracellular TM6, and TM7 region ( Fig. 5 and Supplementary Fig. 12). This step is the the kinetically slowest process during the activation. Once the intracellular and transmembrane rearrangement happens in I3, the rearrangement of structural rearrangement happens between I3 & I4 and I3 & active state of the receptor. In the case of the I3 to I4 transition, N-terminus moves slightly further upward the pocket, and further intracellular features movement happens ( Fig. 5). During I3 to active state transition, the N-terminus moves out of the binding pocket, which TM1 to come inside the pocket; however, intracellular features remain relatively the same (Fig. 5). Between these two transitions, the later transition is kinetically faster as I3 to I4 transition involves N-terminus movement (Fig. 5).
For CB 2 , intermediate states I1, I2, and I3 are structurally and kinetically similar to the inactive state of the receptor, whereas I4 is structurally similar to the active state (Supplementary Table 5; Fig. 6 and Supplementary Fig. 15). K-L divergence shows no major structural changes between the intermediate I1 and I2 & inactive metastable states. Therefore, these states can interconvert with a faster timescale between each other (Fig. 6). The inactive and I3 states are separated in tic one space. Intracellular TM6 and TM7 features are correlated in the tic 1 space ( Supplementary  Fig. 16). In this transition, TM6 moves out slightly, whereas Y 7.53 moves towards the TM5 (Supplementary Table 5 and Supplementary Fig. 15). From I3 to I4 and active state movement, this TM6 movement leads the Y 5.58 to move inside the transmembrane region. Both of these transitions are kinetically slower as it involves rearrangement of the intracellular regions of the CB 2 .
MFPT calculations based on transition path theory (TPT) clearly show that the transition between inactive to active state in CB 2 is much faster than CB 1 . A Kinetic monte Carlo (kMC) study on MSM based transition matrices also verifies that inactive to active transition is faster for CB 2 transition ( Supplementary  Fig. 17a, b). kMC analysis also indicates that transition between inactive, I1, and I2 is kinetically much faster for CB 1 , compared to I1 to I3 transition. I1 to I3 transition leads to the protein moving from the inactive to the active region. Similarly, the kMC study verifies that inactive, I1, I2, and I3 are kinetically closer to each other for CB 2 . Therefore, the transitions between the states are relatively faster. Once the transition happens from I3 to active or active like I4, it remains stabilized in active-like proteins. The above analyses reveal intermediate states during CB 1 and CB 2 activation, by which the sequential conformational change Representative structures from six CB 1 metastable states obtained from VAMPnets are projected on tic 1 and tic 2 spaces. Cartoon representations of receptors are colored based on K-L divergence. Red color is representing higher K-L divergence. K-L divergence of each metastable state was calculated compared to inactive metastable state. Transition between metastable states are shown as arrows. Thickness of the arrows represents mean free passage time (MFPT) of the transition which was calculated using transition path theory (TPT). Red dashed line is representing the slowest transition. The unit of MFPT is in microseconds. Bar plots per metastable state shows the normalized values of important structural features, where [1], [2], [3], [4], [5] represent N-terminus movement, extracellular TM1 movement, toggle switch translational movement, intracellular TM6 movement, intracellular TM7 movement respectively. Errors in bar plots were calculated based on bootstrapping. 20 bootstrap samples were prepared by selecting 1000 frames from each metastable states. Bar plots and error bars represent mean and standard deviation of the 20 samples, respectively. Each sample is shown on the plot as black dot.
happens that leads the protein from the inactive to the active state, and the distinction between changes between CB 1 and CB 2 .
We also performed long-timescale MD simulations from the representative structure of every metastable state of CB 1 and CB 2 (12 systems in total) to observe the stability of each state. Simulation length for each system was approximately 600 ns (Supplementary Note 2). RMSD and RMSF analysis (Supplementary Figs. 18 and 20) on simulation data shows that RMSD of each metastable state remains within 4 Å for both CB 1 and CB 2 , where RMSD for CB 1 metastable states are lower than the CB 2 . This can be explained by the timescale of the interstate transition between CB 1 and CB 2 . Transition path theory (TPT) (Figs. 4, 5) and kinetic monte carlo (kMC) simulation ( Supplementary  Fig. 17) on the MSM weighted simulation show that CB 1 interstate transitions are much slower compared to CB 2 . Therefore, CB 2 structures can fluctuate more rapidly compared to CB 1 , resulting higher RMSD. We observe the same phenomena with time independent component (tIC) projection of long timescale MD simulations. The tIC projections were calculated from the distances between the residue pairs calculated based on residueresidue contact score (RRCS) analysis (Method section). The tIC projections for CB 2 show that the simulations started in different metastable states are overlapping with each other within MD timescale ( Supplementary Fig. 19b), whereas simulations of different metastable states of CB 1 remain separated due to larger kinetic barriers (Supplementary Fig. 19a).
We further compared how the conformational changes in the distinct metastable states lead to different allosteric communication networks. Allosteric pipelines in different metastable states were obtained following the analyses described in Bhattacharya and Vaidehi 51 . From these analyses, major distinctions in the allosteric pipelines were observed for CB 1 in the inactive and Representative structures from six CB 2 metastable states obtained from VAMPnets are projected on tic 1 and tic 2 spaces. Cartoon representations of receptors are colored based on K-L divergence. Red color is representing higher K-L divergence. K-L divergence of each metastable state was calculated compared to inactive metastable state. Transition between metastable states are shown as arrows. Thickness of the arrows represents mean free passage time (MFPT) of the transition which was calculated using transition path theory (TPT). Red dashed line is representing the slowest transition. The unit of MFPT is in microseconds. Bar plots per metastable state shows the normalized values of important structural features, where [1], [2], [3], [4], [5] represent extracellular TM1 movement, toggle switch rotational movement, intracellular TM5 movement, intracellular TM6 movement, intracellular TM7 movement respectively. Errors in bar plots were calculated based on bootstrapping. 20 bootstrap samples were prepared by selecting 1000 frames from each metastable states. Bar plots and error bars represent mean and standard deviation of the 20 samples, respectively. Each sample is shown on the plot as black dot.
inactive intermediate states (I1 and I2) compared to active and active intermediate states (I3 and I4) (Supplementary Fig. 21). In inactive, I1, and I2 states, extracellular TM1, and TM2 are allosterically connected to ECL2 via N-terminus. The downward movement of the N-terminus can be explained to be responsible for these allosteric pipelines as there is more interaction between N-terminus and ECL2. There are fewer pipelines connecting extracellular and intracellular domains. In contrast, major pipelines are observed connecting these two domains in active and active intermediate states. Especially all these three states have an allosteric pipeline through TM6, which indicates the outward movement of TM6 helps to create an allosteric path. On the other hand, for CB 2 , no such clear distinction is observed in allosteric pipelines between active like (active, I4) and inactive like states (Inactive, I1, I2, and I3). Most of the metastable states contains extracellular to intracellular communications (Supplementary Fig. 22). This contrasting network pattern in CB 2 compared to CB 1 can be explained by the lack of the major extracellular conformational changes in CB 2 . Hence, these analyses show how conformational changes and the consequent change in allosteric communications facilitate receptor activation.
Explaining selectivity of Cannabinoid receptors. Ligand binding to a receptor can be explained using two distinct mechanisms, i.e., conformational selection and induced fit 52 . In the induced fit mechanism, ligand binding leads to conformational changes in the receptor. Without the ligand, it is kinetically and thermodynamically hard to reach the bound conformation of the receptor. On the other hand, in conformational selection mechanism, the receptor can adopt many possible conformations. However, the ligand preferentially binds to a particular protein conformational state.
As active and inactive binding pocket volumes of experimentally obtained CB 2 structures are relatively similar, contrasting to the dissimilarity between CB 1 structures, we hypothesized that the CB 2 pocket shape remains similar throughout the activation process. If so, then the binding of CB 2 selective agonists would demonstrate a conformational selection mechanism in preference to this generalized binding pocket conformation seen during CB 2 activation. On the contrary, the CB 1 binding pocket shape changes with the movement of the N-terminus during the activation. Therefore, these selective ligands can only bind to certain metastable states for CB 1 which might have similar pocket characteristics as of CB 2 , decreasing the overall affinity. To test this hypothesis, we first calculated the binding pocket volumes of the metastable states during CB 1 and CB 2 activation. Our analyses show that CB 1 metastable states have different binding pocket volumes based on the position of the N-terminus (Fig. 7a). As the N-terminus remains completely inside the pocket during inactive and I1 states, agonist binding volume is comparatively smaller to the other four metastable states. In the case of CB 2 , the binding pocket volume does not change during the activation due to a lack of N-terminus motion (Fig. 7b). Here our results illustrate how CB 2 maintains a competent conformational state for selective ligand binding throughout all stages of activation.
To find whether CB 2 selective agonist binding affinity depends on the binding pocket shape, we docked CB 2 selective ligands into CB 1 and CB 2 metastable states (described in the method sections). Four CB 2 selective ligands (JWH-133, HU-308, JWH-015, AM-1241) are selected from the literature 32 , which shows at least tenfold selectivity (Supplementary Fig. 23). Docking results of CB 2 selective ligands to CB 1 show that docking affinity differs for different metastable states (Fig. 7c). For inactive and I1 states where the binding pocket volume is small, the docking is hard due to the steric hindrance of the N-terminus. In the other four states, docking results show variational docking affinity, with I2 states having the highest affinity for all the CB 2 selective ligands compared to the active state. Therefore, in the CB 1 active state, these ligands have a less binding probability. This differential binding affinity is explained by the differential binding pose of the ligands in I2 and active states. In I2, there is a larger gap between TM2 and ECL2, which allow the ligands to bind between the space of ECL2, TM1, and TM2 ( Supplementary  Fig. 24a, b). In the active state, the ligands bind deep in the pocket due to the movement of the toggle switch. Toggle switch movement creates a space for ligand to bind (Fig. 8a, c, e, g).
To  Table 6). To show the difference in binding positions of the ligands, we computed the stable dynamic contacts and corresponding interaction energies between the ligand and receptor in both the metastable states of the receptors. Calculations of the dynamic contacts show that there are major differences in stable contacts between the two states of CB 1 (Supplementary Fig. 27). In active state of CB 1 , the ligands are bound deeper inside the pockets. Hence, in the active state, the ligands are forming major interactions with the TM5 residues. The interaction energy contributions for the residues reveals the important residues in different metastable state (Supplementary Fig. 28). K192 3.92 contributes to CB 1 I2 state binding for every ligand, whereas W279 5.43 contributes distinctively to CB 1 active state binding for JWH-133, JWH-105 and AM1241.
On the other hand, the docking affinities for these ligands are similar to each metastable state for CB 2 as the receptor pocket shape remains relatively similar (Figs. 7d, 8b, d, f, h).  Table 6). As the ligand binding poses are relatively similar in distinct metastable states of CB 2 , the differences in number of stable contacts and corresponding energy contributions are also small compared to CB 1 (Supplementary Fig. 29 and Supplementary  Table 7). Therefore, these analyses show that how the difference in pocket shape due to distinct activation mechanism of CB 1 and CB 2 may lead to ligand selectivity towards CB 2 .
Conclusions CB 2 selective agonists are emerging as potential drug targets for treating inflammation. A number of CB 2 selective agonists have entered clinical trial phase 36,37 . However, none of these molecules has been approved as a drug by FDA. There are two of the major drawbacks in developing CB 2 selective drugs. First, high lipophilic nature of the CB 2 selective agonists are not suitable to be drug target. Second, some CB 1 activity may retain in those molecules which leads to off-target side-effect. In this work, we mechanistically explain the selectivity of CB 2 selective agonists by hypothesizing that reason for selectivity lies in the difference of activation mechanism. Here, we have studied the activation mechanism of both CB 1 and CB 2 by with combinations of using molecular dynamics, Markov state model and neural network.
Our results showed three major distinctions in conformational changes in both receptors. (1) In the extracellular region due to the movement of the N-terminus, we observe a large change in the binding pocket volume during the activation of the CB 1 , whereas, for CB 2 there are hardly any changes in the binding pocket. (2) For CB 1 , toggle switch W 6.48 undergoes translational changes during the activation mechanism, which lead to the changes in the intracellular matrices. While, for CB 2 , toggle switch W 6.48 only rotates in the same position. Our analysis also shows that toggle switch rotation does not effect the movement intracellular movements for CB 2 . (3) Intracellular TM5 movement of CB 2 affects the TM6 and TM7 movement critical for receptor activation. On the other hand, no major conformational change is observed in intracellular TM5 of CB 1 .
VAMPnets analysis further reveals the metastable states and sequential transition of the structural features during the activation. During the CB 1 activation, initial extracellular movements of N-terminus, TM1 lead to the change in further change in the transmembrane and intracellular features. Conversely, due to lack of the extracellular changes, the activation of CB 2 is kinetically faster compared to the CB 1 .
Further analyses show that due to the lack of the extracellular movement during CB 2 activation, binding pocket volume remains relatively same for CB 2 . Docking of metastable states show that CB 2 selective agonists show the similar high binding affinities for each metastable states of CB 2 , depicting the preferential binding of these ligands in the generalized CB 2 binding conformation. Whereas, these ligands can only be selectively bind to few metastable states in CB 1 due to the larger conformational change in the CB 1 binding pocket. This explains lack of overall binding affinity of these ligands towards CB 1 . Overall, we provide mechanistic knowledge about cannabinoid receptor selectivity by studying the activation mechanism. This study will guide to design new CB 2 selective agonists with better druggability profile by acting as virtual screening criteria for these ligands.

Methods
System preparation. To perform apo (without ligand) simulation, we started molecular dynamics simulations from inactive (CB 1 inactive state PDBID: 5TGZ 9 ; CB 2 inactive state PDBID: 5ZTY 12 ) and active (CB 1 active state PDBID: 5XRA 11 ; CB 2 active state PDBID: 6KPF 16 ) state structures of CB 1 and CB 2 . Non-protein residues were deleted from the CB 1 and CB 2 crystal structure. The thermostabilized mutations were mutated back to original residue using tleap. Hydrogen atoms were added to protein amino acid residues using reduce command of AMBER package 53 . To add consistency in the residue numbering, 4 residues (CB 1 residue number [99][100][101][102][103] were added in the truncated N-terminus for CB 1 active structure. Intracellular region between TM5 and TM6 was truncated for both CB 1 and CB 2 . Terminal end of N-terminus and C-terminus & unconnected residues of TM5 and TM6 were neutralized with ACE and NME residue. Both CB 1 and CB 2 was embedded in POPC bilayer with 150 mM NaCl solution in both extracellular and intracellular direction. TIP3P water model was used for MD system as this water model gives better performance in ref. 54 . AMBER forcefield was selected to Fig. 7 Binding pocket volume and docking calculations for metastable states of cannabinoid receptors. Binding pocket volumes for each metastable state of CB 1 (a) and CB 2 (b) are plotted against N-terminus distance as a scatter plot. Docking affinity (kcal/mol) of the four CB 2 selective agonists for each metastable state of CB 1 (c) and CB 2 (d) are plotted as a bar plot. Docking and volume calculations were performed on 100 structures from each metastable state selected based on the MSM probabilities. Colors for CB 1 metastable states change from orange to blue gradually from inactive to active state. Colors for CB 2 metastable states change from green to magenta gradually from inactive to active state. For each protein structure, docking affinity is obtained as an average of top three dock poses. Error shown in docking calculations is the standard error which the standard deviation of docking affinity distribution obtained from 100 structures divided by ffiffiffiffiffiffiffiffi 100 2 p .
parameterize the MD systems as the this forcefield was used to perform simulations for other GPCRs and all the necessary parameters for molecules in MD system are available in this forcefield [55][56][57] . AMBER ff14SB and lipid17 forcefield was used for protein and lipid parameterization 53,58 . Agonist bound active (CB 1 agonist: AM11542, CB 2 agonist: AM841) and antagonist inactive (CB 1 antagonist: AM6538, CB 2 antagonist: AM10257) holo simulation were also performed. GAFF forcefield parameters for the ligands are obtained using antechamber 59,60 . Similar modifications were made in the proteins using tleap and embedded in bilayer using CHARMM-GUI 61 .
Simulation details. Atomistic unbiased molecular dynamics simulation was performed with AMBER v18 MD engine 62 . Before running the production MD run, prepared MD systems were minimized and equilibrated. Minimization was performed for 15,000 steps; where the systems were minimized for 5000 steps with gradient descent algorithm and followed by conjugate gradient algorithm for the rest 10,000 steps. Minimized systems was heated and pressurized sequentially in NVT and NPT ensemble, respectively. Heating step was performed for 2 ns where system was heated from 0 K to 300 K in NVT ensemble. Anisotropic pressure was applied to the heated systems to reach the constant pressure of 1 bar with NPT ensemble. Pressurization step was also performed for 2 ns. Berendsen thermostat and barostat was used to fix the temperature and pressure 63 . During the heating and pressurization, protein backbone for each system was restrainted using a spring force with a spring constant of 10 kcal/mol/Å 2 . Subsequently, restraint was removed and each system was equilibrated for 46 ns for collecting data. Numerical integration was performed using verlet algorithm to update velocity and coordinate. Time step for each numerical integration is 2 fs. Shake algorithm was implemented by constraint bond movement of the hydrogens to bonded to heavy atoms 64 . Periodic boundary condition was applied to each MD system. Particle mesh Ewald method (PME) was implemented for long-range electrostatic calculation 65 .
Featurization of protein conformational ensemble. To perform adaptive sampling and markov state model (MSM) building (Discussed in the following sections), protein conformations in MD system need to be represented by the descriptors that can capture protein conformational changes. To find the descriptors, we have implemented residue-residue contact score (RRCS) analysis 66,67 . Using RRCS method, a score was assigned to each residue pair distances for a particular protein conformation which are more than one helical tern away. This score was calculated based on the every atom pair distances between two residues following the same algorithm of Zhou et al. 66 . To find the major structural changes during activation for CB 1 and CB 2 , we compared the change in RRCS between the active and inactive structure. α-carbon distances of the residue pairs, where the RRCS is changing more than 3, were used as a feature to describe a protein conformation (Supplementary Tables 8, 9). As shown in Supplementary  Fig. 30, these descriptors can capture the extracellular, transmembrane and intracellular changes for both CB 1 and CB 2 .
Adaptive sampling. Previous experimental and computational studies depict that GPCR activation occurs in a microsecond to millisecond timescale 20,68-70 . However, each numerical integration time step for MD simulation is of femtosecond timescale 71 . Therefore, capturing the entire activation process by running a single long trajectory is time consuming process. Furthermore, to capture the thermodynamic and kinetics of the conformational changes better sampling is needed in high energy transition states 72 . Thus, to capture the entire protein conformational ensemble, adaptive sampling protocol was implemented [72][73][74][75] . Adaptive sampling is a well established sampling technique used for studying ligand binding [76][77][78] , protein conformational change 67,68,[79][80][81] and ligand selectivity 82 . First, conformations obtained from MD simulations are expressed in terms of collective variables. Second, collective variable space are clustered into different states using k-means clustering. Third, based on the population of the each state, MD frames are selected from least populated clusters to start simulation for the next round. GPCRs consist of the multiple allosterically coupled conversed motifs distributed in extracellular, transmembrane and intracellular regions. These motifs structurally rearrange themselves when protein conformational change happens from inactive to active. Therefore, finding suitable adaptive sampling matrices to minimize the amount of simulation time is difficult. Here, we used our preexisting knowledge of the structural changes to select CVs and modify the CV based on the requirement. For initial round of sampling, we selected the extracellular TM1-TM6 distance and intracellular TM3-TM6 distance as a collective variable for adaptive sampling. For both CB 1 and CB 2 , sampling was performed from both active and inactive structures until energy landscape was connected. Next, 24 and 20 distinct features (Extracellular distances, Toggle switch movements, Intracellular TM6 and TM7 movements) were selected for CB 1 and CB 2 to improve the speed of the sampling (Supplementary Tables 10, 11). To check whether simulations able to observe the transition between the inactive to active states in the RRCS descriptor space (as discussed in previous section), we performed dimensionality reduction using time independent component analysis. On the projection of slowest two tic components, transition between active and inactive state was observed in CB 2 conformational ensemble. However, there was a disconnect between active and inactive minima in tic projection of CB 1 . Therefore, additional rounds of sampling were performed from the clusters that are closest from each other in the two disconnected regions to observe the transition between active and inactive minima is observed for CB 1 .
In total, 490 and 278 μs simulations were performed to capture apo activation process for CB 1 and CB 2 ( Supplementary Fig. 34a, b). Adaptive sampling for holo systems is performed on the 24, 20 structurally known features for CB 1 and CB 2 holo systems for approximately 20 μs.
Markov state models. From adaptive sampling, we generate plethora of short trajectories that only capture the small changes in protein conformation. Markov state model was developed to connect the information from individual trajectories to by building transition network between distinct states [83][84][85] . In markovian processes, the future step of the process should only depend on present states, not on the past states 86 . Molecular dynamic simulations follow the same principle as the position and velocity of the future step is calculated from the energy and force calculation based on the current state. From MSM, we obtain stationary probability distribution of the protein ensemble and timescales of the slowest process by calculating transition probability matrix (TM). TM calculates probability of transition between discretized protein space by calculating the jump between different states after lag time. Lag time is the minimum time at which markovian property of the discretized states is valid. Building markov state model consist of four different steps: featurization, dimensionality reduction, clustering and hyperparameter optimization. Featurization of the protein conformational space was done from the distances calculated from RRCS (discussed in previous subsection). Dimensionality reduction was performed on the featurized space with time independent component analysis (tICA) 87 . MSM captures the slowest timescale processes. Using tICA, we obtain new representations of features which is a linear combination of slowest features. Therefore, in the next discretization step, clustering algorithm can discretize the protein space which are distinct in slowest timescale. K-Means clustering algorithm was implemented in this case. The hyperparameters that needs to be optimized are lagtime at which markovian property of the discretized space is valid, number of tic dimensions on which clustering is performed and number of cluster components or microstates that can distinguish the slowest components. With a minimum lag time of 25 ns for both the systems, slowest timescales obtained from MSM converges, atleast in the log scale ( Supplementary Fig. 31a, b). Hence, the markovian property is valid. For further testing of markovian property of models at lag time of 25 ns, Chapman-Kolmogorov test (C-K test) was performed ( Supplementary Fig. 32a, b) 88 . To optimize other two hyperparameters, the VAMP-2 score was calculated 89,90 . For reversible process, which follows the detailed balance, VAMP-2 score is the sum of the squares of the highest eigenvalues 91 . MSM with the highest VAMP-2 score better captures the slowest timescale processes. To avoid overfitting for calculation of the VAMP-2 scores, 10fold cross validation process was used. The hyperparameters which maximizes the VAMP-2 scores are considered for the final MSM estimation ( Supplementary  Fig. 31c, d). Final optimized MSM reweighed the conformational populations obtained from adaptive sampling ( Supplementary Fig. 33a, b). Pyemma v2.5.6 software package was used for entire MSM building process 92 .
Trajectory analysis. Features such as distances, angles are calculated using python library MDtraj v1.9.8 93 . VMD v1.9.3 software package was used for protein figure making and trajectory visualization 94 . Ambertools CPPTraj v18.01 was used for frame selection and trajectory modulations 95 . All the analysis codes (e.g. conditional probability calculation) were written in python programming language and matplotlib library was used for plotting the figures as a graph. POVME v3.0 software package was used for volume calculation of the binding pocket 96 .
Metastable state estimatation using VAMPnets. VAMPnets is a deep learning architecture used for building coarse grain MSM 48,91,97 . This architecture consists of two parallel deep learning networks which take the input feature values at time t and t+τ, where τ is the lagtime. This deep learning model is optimized by maximizing the VAMP-2 score. The output of the VAMPnets is the probability of the each frame belonging to a certain metastable state. VAMPnets was implemented using python library deeptime v0.4.3 98 . RRCS calculated features were as an input. Lagtime for the VAMPnets is selected as 25 ns. The number of the metastable states for each system was selected by the two creteria : (1) Each metastable state should have atleast 4 percent of the total population ( Supplementary Fig. 10, 11). (2) error bar in the timescale calculations is minimum ( Supplementary Figs. 8, 9). Based on these creteria, the number of metastable states chosen for CB 1 and CB 2 is six.
Data-driven analysis. To capture the major structural changes between different macro states, closest heavy carbon atom distances was calculated between every residue pair of each frame. These calculations were performed on 1000 frames from each metastable state. These frames are selected based on the the probability of microstates belonging to the metastable states. Distance distribution between certain residue pair is compared to the identical residue pair distance distribution between distinct metastable states. Comparison is performed with symmetric K-L divergence analysis. To estimate the contribution of each residue, average K-L divergence of the all the residue pair distances was calculated that are associated with that residue 49 .
Allosteric pathway calculation. To predict the allosteric pathway in different metastable states, similar procedure as explained in Bhattacharya and Vaidehi 51 . First, normalized mutual information for each residue pair was calculated per metastable state based on dihedral angle movement. These calculations was performed on the 1000 frames as discussed in Data-driven analysis section. Based on the residue pairs, where closest heavy atom are within 5 Å, an undirected acyclic graph was created. Nodes in the graph denote a residue and edges are between the two residues which are within 5 Å of one another. Edge weight represents the MI max -MI ij , where MI ij is the mutual information between the two nodes i and j. Shortest allosteric path was calculated for the residue pairs that are atleast 12 Å away from each other. Among all the calculated paths, 500 paths with highest mutual information were selected. These paths were clustered into optimal number of clusters based on the procedure explained in Bhattacharya and Vaidehi 51 . Mutual information calculations have been performed with MDentropy v0.3.0 software package 99 .
Docking calculations. To find the docking pose and the docking score of the CB 2 selective ligands, in distinct metastable states of CB 1 and CB 2 , four CB 2 selective ligands were docked into binding pocket of distinct metastable states of CB 1 and CB 2 . Four ligands were selected based on the criterion that those ligands show at least 10 fold more affinity for CB 2 than CB 1 32 . Ligand structures were downloaded from pubchem using sdf format which is converted into the pdb format using Charmm-gui ligand builder 61,100 . Amber tools Antechamber is used for converting the pdb format to mol2 format 59,60 . For docking, mol2 files were converted to pdbqt format. To represent each metastable state, 100 protein structures were selected from each metastable state proportional to the probability of microstates and converted to pdbqt format. Autodock-vina v1.2.3 software was used for docking calculations 101,102 . To specify the docking center, center of mass of alpha-carbon of binding pocket residues for CB 1 and CB 2 was selected ( Supplementary Fig. 35).
Transition path theory. Transition path theory was implemented to calculate the effective timescale between metastable state transition 103,104 . Effective timescale between two metastable states (A and B) is obtained from mean free passage time (1/k AB ) as shown in equation 1/k AB =(τ ∑ m i¼1 π i ð1 À q þ i Þ=F, where F is the total flux between metastable state A to B and q þ i is the committer probability which is defined as the probability of microstate i reaching the metastable state B before A. A microstate is assigned to a metastable state based on the highest probability of belonging to each metastable state. Thus, six metastable states are defined as distinct microstates. TPT calculations were performed using Pyemma v2.5.6 software 92 .
Error calculation. To calculate errors for thermodynamic and kinetic quantities obtained from simulation, bootstrapping approach was followed 81,88 . For a certain calculation, N numbers of random bootstrap samples were generated with 80% of total trajectories. Based on the standard deviation of a calculated quantity from bootstrapped samples, error bars were generated. For example, for error calculation in free energy plots, MSM was created for 200 times using 80% of the original data to find the standard deviation in the probability density.
Kinetic Monte Carlo simulation. Based on the transition matrix of MSM, kinetic monte carlo (kMC) simulation was implemented to observe the evolution of features with time 68,105 . kMC simulation is a stochastic process where in a given time step, next transition is selected based on the probability of the all possible transitions. The algorithm for KMC simulation is as follows: (1) Simulation starts at a particular microstate (obtained from clustering; as discussed in earliar section). (2) From the MSM transition matrix, the probabilities of all possible transitions (P i,j ) from that microstate are obtained. As the sum of all transition probabilities is 1, a cumulative distribution (S) of possible transitions is created where S i;j ¼ ∑ k¼j k¼0 P i;k . A random number (R) is generated between 0 to 1 to find the next possible transition. If R is between S i,j and S i,j+1 , ith state to j + 1th state transition is selected and simulation moves forward by lag time (τ). This algorithm is followed iteratively. In this case, kMC simulation was initialized from the inactive state of both the proteins.
Reporting summary. Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.

Data availability
All the necessary files (MSM object, bootstrap files, feature files) and trajectories (apo and holo simulation) that have been used for the analysis to generate figures can be obtained from https://uofi.box.com/s/jzooa0o27z1w9ha0h6va3i51ir7l38j4.